MBGapp: A Shiny application for teaching model-based geostatistics to population health scientists

User-friendly interfaces have been increasingly used to facilitate the learning of advanced statistical methodology, especially for students with only minimal statistical training. In this paper, we illustrate the use of MBGapp for teaching geostatistical analysis to population health scientists. Using a case-study on Loa loa infections, we show how MBGapp can be used to teach the different stages of a geostatistical analysis in a more interactive fashion. For wider accessibility and usability, MBGapp is available as an R package and as a Shiny web-application that can be freely accessed on any web browser. In addition to MBGapp, we also present an auxiliary Shiny app, called variogramApp, that can be used to aid the teaching of Gaussian processes in one and two dimensions using simulations.

The variogramApp is a separate web application that can be used to assist students in learning the theory of Gaussian processes and in interpreting variogram plots. In particular, it allows users to visualize the spatial patterns and smoothness of Gaussian processes that arise from a given covariance function and set of parameters, and how these are reflected in the variogram. The app provides simulation tools in both one and two dimensions. To run the app in R, the development version on GitHub can be launched directly by entering shiny::runGitHub(repo="variogramApp", username= "olatunjijohnson", ref="master", subdir = "inst/variogramApp"), or the package can be installed using devtools::install_github("olatunjijohnson/variogramApp", ref="master").
After installing the app, it can be loaded in R using
The main area of the app displays two figures. The first is a simulated surface of $S(x)$ or $S(x) + Z(x)$, where $S(x)$ is a zero-mean, isotropic and stationary Gaussian process with $\mathrm{Cov}(S(x), S(x')) = \sigma^2 \rho(u, \theta)$, where $\sigma^2$ is the variance, $\rho(u, \theta)$ is the correlation function with parameter $\theta$ and $u$ is the Euclidean distance between locations $x$ and $x'$; and $Z(x)$ is independent and identically distributed Gaussian noise with nugget variance $\tau^2$. The second is the semi-variogram plot, which describes the spatial dependence in the data and gives a sense of the autocorrelation structure of the underlying stochastic process.
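To make the link between the two figures concrete, the theoretical semi-variogram implied by the definitions above can be written out. This is the standard identity for a stationary process plus independent noise; the symbols $Y(x) = S(x) + Z(x)$ and $\gamma(u)$ are introduced here only for illustration and do not appear in the app's description.

```latex
\begin{aligned}
\gamma(u) &= \tfrac{1}{2}\,\mathrm{Var}\{Y(x) - Y(x')\} \\
          &= \tfrac{1}{2}\left[2(\sigma^2 + \tau^2) - 2\sigma^2\rho(u, \theta)\right] \\
          &= \tau^2 + \sigma^2\{1 - \rho(u, \theta)\}, \qquad u > 0 .
\end{aligned}
```

Under this identity, the curve starts near the nugget variance $\tau^2$ at small distances and, provided $\rho(u, \theta) \to 0$ as $u$ grows, levels off at $\tau^2 + \sigma^2$.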
The following is the description of the menus in the sidebars of the app.

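1. Correlation functions: A number of correlation functions are provided in the app. Let $\theta = (\phi, \kappa)$ denote the vector of correlation parameters, where $\phi$ is the scale parameter and $\kappa$ is the smoothness parameter. The correlation functions, written here in their standard parametrizations, are as follows:
• Exponential: $\rho(u, \theta) = \exp(-u/\phi)$.
• Matern: $\rho(u, \theta) = \{2^{\kappa - 1}\Gamma(\kappa)\}^{-1} (u/\phi)^{\kappa} K_{\kappa}(u/\phi)$, where $K_{\kappa}(\cdot)$ is the modified Bessel function of the second kind of order $\kappa > 0$.
• Powered exponential: $\rho(u, \theta) = \exp\{-(u/\phi)^{\kappa}\}$, with $0 < \kappa \le 2$.
• Cauchy: $\rho(u, \theta) = \{1 + (u/\phi)^2\}^{-\kappa}$.
• Gneiting: $\rho(u, \theta) = \{1 + 8sh + 25(sh)^2 + 32(sh)^3\}(1 - sh)^8$ for $0 \le sh \le 1$ and $0$ otherwise, where $h = u/\phi$ and $s = 0.301187465825$.
• Pure nugget: $\rho(u, \theta) = k$, where $k$ is a constant value. This model corresponds to no spatial correlation.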

2. Dimension: The user has the option of simulating the Gaussian process in one or two dimensions.
3. Domain: Depending on the dimension selected, the user can choose the domain of interest. For example, in one dimension, the user can choose 500, 1000 or 1500 time points, while in two dimensions, selecting a 150 by 100 domain implies an x-axis ranging from 0 to 150 and a y-axis ranging from 0 to 100.
4. Variance parameter, $\sigma^2$: The variance parameter must be greater than zero.
5. Scale parameter, $\phi$: The scale parameter regulates the rate at which the spatial correlation decays for increasing distance $u$.
6. Nugget variance, $\tau^2$: The variance of the nugget effect can be specified here if the user wants to include a nugget in the simulation. Setting the nugget variance to zero implies no nugget effect.

7. Smoothness parameter, $\kappa$: The shape or smoothness parameter determines the differentiability of the process $S(x)$.

Sampling method: We provide two methods of sampling: random and regular. Random sampling selects locations at random, while regular sampling imposes a minimum distance between any two sampled locations, which introduces a minimum distance parameter $\delta$ for setting this distance.
11. Number of bins: This is the number of distance classes over which the empirical variogram is averaged.
12. Number of simulations: This allows the user to repeat the experiment multiple times. If the number of simulations is greater than 1, there is an option to play through the results.
13. Perform Simulation: Clicking this button runs the simulation with the selected settings.
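The menu options above can be mimicked in a few lines of base R. The following is a minimal sketch and not the app's own source code: the function names (matern_cor, sample_min_dist, empirical_variogram), the parameter values, the sample size of 100 and the choice of half the domain length as the maximum variogram distance are all assumptions made for illustration. In particular, the minimum-distance rule is implemented here as a simple sequential inhibition scheme, which is only one plausible reading of "regular sampling".

```r
# Base-R sketch (illustration only, not the app's source code).
set.seed(123)

# Matern correlation with scale phi and smoothness kappa;
# kappa = 0.5 gives the exponential model exp(-u / phi).
matern_cor <- function(u, phi, kappa) {
  r <- u / phi
  out <- (2^(1 - kappa) / gamma(kappa)) * r^kappa * besselK(r, kappa)
  out[u == 0] <- 1  # value at zero distance (the formula gives 0 * Inf there)
  out
}

# Sidebar-style inputs (hypothetical values).
x      <- 1:500   # one-dimensional domain with 500 points
sigma2 <- 1       # variance parameter
phi    <- 20      # scale parameter
kappa  <- 1.5     # smoothness parameter
tau2   <- 0.2     # nugget variance
delta  <- 3       # minimum distance for regular sampling
n_bins <- 15      # number of variogram bins

# Simulate Y = S + Z over the whole domain via a Cholesky factor of
# Cov(Y) = sigma2 * R + tau2 * I.
D <- as.matrix(dist(x))
V <- sigma2 * matern_cor(D, phi, kappa) + tau2 * diag(length(x))
y <- drop(crossprod(chol(V), rnorm(length(x))))

# Assumed "regular" scheme: visit locations in random order and keep a
# location only if it is at least delta away from all locations kept so far.
# May return fewer than m locations if delta is large.
sample_min_dist <- function(x, m, delta) {
  keep <- integer(0)
  for (i in sample(length(x))) {
    if (all(abs(x[i] - x[keep]) >= delta)) keep <- c(keep, i)
    if (length(keep) == m) break
  }
  keep
}

idx_random  <- sort(sample(length(x), 100))
idx_regular <- sort(sample_min_dist(x, 100, delta))

# Binned empirical semi-variogram: average half squared differences
# within each distance class.
empirical_variogram <- function(x, y, n_bins,
                                max_dist = (max(x) - min(x)) / 2) {
  u <- as.vector(dist(x))
  v <- as.vector(dist(y))^2 / 2
  bin <- cut(u, seq(0, max_dist, length.out = n_bins + 1),
             include.lowest = TRUE)
  data.frame(
    distance = as.vector(tapply(u, bin, mean)),
    gamma    = as.vector(tapply(v, bin, mean)),
    pairs    = as.vector(table(bin))
  )
}

vg <- empirical_variogram(x[idx_regular], y[idx_regular], n_bins)

# Theoretical counterpart, tau2 + sigma2 * (1 - rho(u)), for comparison.
vg$theoretical <- tau2 + sigma2 * (1 - matern_cor(vg$distance, phi, kappa))
print(round(vg, 3))
```

Replacing idx_regular with idx_random in the call to empirical_variogram gives the random-sampling version. Rerunning the block without set.seed corresponds roughly to increasing the number of simulations, and shows how much the empirical variogram varies around the theoretical curve from one realization to the next.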